Gene Ontology-based Similarity Measures for Gene Clustering and Knowledge Discovery

Authors

  • Panos Pardalos
  • Win Phillips
  • Alkis Vazacopoulos
  • Theodore Trafalis
  • Jian Pei
  • James Keller
  • Jianbo Gao
Abstract

Computing with words and perceptions, or CWP for short, is a mode of computing in which the objects of computation are words, propositions and perceptions described in a natural language. Perceptions play a key role in human cognition. Humans, but not machines, have a remarkable capability to perform a wide variety of physical and mental tasks without any measurements or computations. Everyday examples of such tasks are driving a car in city traffic, playing tennis and summarizing a book. One of the major aims of CWP is to serve as a basis for equipping machines with a capability to operate on perception-based information. A key idea in CWP is that of dealing with perceptions through their descriptions in a natural language. In this way, computing and reasoning with perceptions is reduced to operating on propositions drawn from a natural language. In CWP, what is employed for this purpose is PNL (Precisiated Natural Language). In PNL, a proposition, p, drawn from a natural language, NL, is represented as a generalized constraint, with the language of generalized constraints, GCL, serving as a precisiation language for computation and reasoning. PNL is equipped with two dictionaries and a modular multiagent deduction database. The rules of deduction are expressed in what is referred to as the Protoform Language (PFL). Any measurement-based theory, T, may be generalized to a perception-based theory, Tp, by adding to T the capability to operate on perception-based information. Two generalizations that are of particular importance involve probability theory, PT, and decision analysis, DA. Conceptually, computationally and mathematically, PTp and DAp are significantly more complex than their measurement-based versions. In this instance, as in many others, complexity is the price that has to be paid to reduce the gap between theory and reality.

Knowledge Base-Clustering of Multi-Class SVM for Genes Expression Analysis. Theodore B. Trafalis, Budi Santosa and Tyrrell Conway. School of Industrial Engineering, University of Oklahoma; Department of Botany and Microbiology, University of Oklahoma.

Similar resources

Document Clustering Based on Ontology and a Fuzzy Approach (خوشه‌بندی اسناد مبتنی بر آنتولوژی و رویکرد فازی)

Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering, the unsupervised classification of similar documents into different groups, is one of the important techniques of text mining. The most important step...

Mixture Model Adaptive Neural Network for Mining Gene Functional Patterns from Heterogeneous Knowledge Domains

Gene Ontology (GO) annotation and gene expression profiling have been two major approaches for system-wide analysis of gene functions. Current high-throughput sequence alignment and microarray technologies produce large volumes of noisy data. In the literature, numerous clustering methods have been studied for discovery of gene functional grouping based on either approach. But there is a lack o...

Gene Ontology Similarity Measures Based on Linear Order Statistics

The standard method for comparing gene products (proteins or RNA) is to compare their DNA or amino acid sequences. Additional information about some gene products may come from multiple sources, including the set of Gene Ontology (GO) annotations and the set of journal abstracts related to each gene product. Gene product similarity measures can be based on evaluating sets of descriptor terms fo...

A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool

SUMMARY Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is ...

A Comparative Study of Ontology Based Term Similarity Measures on PubMed Document Clustering

Recent research shows that ontology as background knowledge can improve document clustering quality with its concept hierarchy knowledge. Previous studies take term semantic similarity as an important measure to incorporate domain knowledge into clustering process such as clustering initialization and term re-weighting. However, not many studies have been focused on how different types of term ...

Optimal Gene Clustering Based on a Semantic Similarity Measure (Clustering optimal de gènes fondé sur une mesure de similarité sémantique)

In various application domains of knowledge extraction or information retrieval, objects are not represented as feature vectors in a vector space but as a pairwise similarity matrix. In molecular biology, such a similarity measure either captures the object structure (e.g. molecules, proteins as sequences of amino acids) or the semantics of their description (genes or diseases described with on...

Publication date: 2004